Min-Wise Independent Permutations

Authors

  • Andrei Z. Broder
  • Moses Charikar
  • Alan M. Frieze
  • Michael Mitzenmacher
Abstract

We define and study the notion of min-wise independent families of permutations. We say that $F \subseteq S_n$ is min-wise independent if for any set $X \subseteq [n]$ and any $x \in X$, when $\pi$ is chosen at random in $F$ we have $\Pr(\min\{\pi(X)\} = \pi(x)) = \frac{1}{|X|}$. In other words, we require that all the elements of any fixed set $X$ have an equal chance to become the minimum element of the image of $X$ under $\pi$. Our research was motivated by the fact that such a family (under some relaxations) is essential to the algorithm used in practice by the AltaVista web index software to detect and filter near-duplicate documents. However, in the course of our investigation we have discovered interesting and challenging theoretical questions related to this concept – we present the solutions to some of them and we list the rest as open problems.

∗ Digital SRC, 130 Lytton Avenue, Palo Alto, CA 94301, USA.

† Computer Science Department, Stanford University, CA 94305, USA. Part of this work was done while this author was a summer intern at Digital SRC. Supported by the Pierre and Christine Lamond Fellowship and in part by an ARO MURI Grant DAAH04-96-1-0007 and NSF Award CCR-9357849, with matching funds from IBM, Schlumberger Foundation, Shell Foundation, and Xerox Corporation.

‡ Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA. Part of this work was done while this author was visiting Digital SRC. Supported in part by NSF grant CCR9530974.

§ Digital SRC, 130 Lytton Avenue, Palo Alto, CA 94301, USA.
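The definition in the abstract can be checked by brute force for small n. The sketch below is illustrative only and is not taken from the paper: the function name `is_min_wise_independent` is hypothetical, permutations are stored as tuples, and the ground set is taken to be {0, ..., n-1} rather than [n] for convenience. It confirms that the full symmetric group satisfies the definition, while the family of cyclic shifts does not.

```python
from fractions import Fraction
from itertools import combinations, permutations


def is_min_wise_independent(family, n):
    """Check the min-wise condition for a uniformly chosen pi in `family`.

    `family` is a list of permutations of range(n), each given as a tuple
    where pi[x] is the image of x (0-indexed ground set: an assumption).
    """
    total = len(family)
    for size in range(1, n + 1):
        for X in combinations(range(n), size):
            # Count how often each x in X attains the minimum of pi(X).
            counts = {x: 0 for x in X}
            for pi in family:
                counts[min(X, key=lambda x: pi[x])] += 1
            # Every element must win with probability exactly 1/|X|.
            if any(Fraction(c, total) != Fraction(1, size) for c in counts.values()):
                return False
    return True


n = 4
full_group = list(permutations(range(n)))
cyclic_shifts = [tuple((x + s) % n for x in range(n)) for s in range(n)]

print(is_min_wise_independent(full_group, n))     # True
print(is_min_wise_independent(cyclic_shifts, n))  # False
```

The cyclic shifts fail already on X = {0, 1}: element 0 is the minimum under three of the four shifts, so its probability is 3/4 rather than 1/2.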


Similar resources

Constructing an Optimal Family of Min-Wise Independent Permutations

A family C of min-wise independent permutations is known to be a useful tool for indexing replicated documents on the Web. For any integer $n > 0$, a family C of permutations on $[n] = \{1, 2, \ldots, n\}$ is said to be min-wise independent if for any (nonempty) $X \subseteq [n]$ and any $x \in X$, $\Pr(\min\{\pi(X)\} = \pi(x)) = \|X\|^{-1}$ when $\pi$ is chosen uniformly at random from C, where $\|A\|$ is the cardinality of a finite set...


On restricted min-wise independence of permutations

A family of permutations $F \subseteq S_n$ with a probability distribution on it is called k-restricted min-wise independent if we have $\Pr[\min \pi(X) = \pi(x)] = \frac{1}{|X|}$ for every subset $X \subseteq [n]$ with $|X| \leq k$, every $x \in X$, and $\pi \in F$ chosen at random. We present a simple proof of a result of Norin: every such family has size at least $\binom{n-1}{\lfloor \frac{k-1}{2} \rfloor}$. Some features of our method might be of independent interes...


Min-Wise Independent Linear Permutations

A set of permutations $F \subseteq S_n$ is min-wise independent if for any set $X \subseteq [n]$ and any $x \in X$, when $\pi$ is chosen at random in $F$ we have $P(\min\{\pi(X)\} = \pi(x)) = \frac{1}{|X|}$. This notion was introduced by Broder, Charikar, Frieze and Mitzenmacher and is motivated by an algorithm for filtering near-duplicate web documents. Linear permutations are an important class of permutations. Let p be a (large) prime a...


Almost K-Wise vs. K-Wise Independent Permutations, and Uniformity for General Group Actions

A family of permutations in $S_n$ is k-wise independent if a uniform permutation chosen from the family maps any sequence of k distinct elements to any sequence of k distinct elements with equal probability. Efficient constructions of k-wise independent permutations are known for k = 2 and k = 3 based on multiply transitive permutation groups but are unknown for k ≥ 4. In fact, it is known that the...
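As a small illustration of the k-wise independence definition above, the sketch below tests a family by enumeration. It is not from the cited paper: the helper name `is_k_wise_independent` is hypothetical, and the example family (the affine maps x → ax + b mod p with a ≠ 0, a sharply 2-transitive group) is chosen here only as a standard instance of a pairwise independent family.

```python
from collections import Counter
from itertools import permutations


def is_k_wise_independent(family, n, k):
    """Check that a uniform pi in `family` maps every ordered k-tuple of
    distinct elements of range(n) to each such k-tuple equally often."""
    targets = list(permutations(range(n), k))
    for src in targets:
        counts = Counter(tuple(pi[x] for x in src) for pi in family)
        # All targets must be reached, and all with the same frequency.
        if len(counts) != len(targets) or len(set(counts.values())) != 1:
            return False
    return True


p = 5  # a small prime, chosen for illustration
affine = [
    tuple((a * x + b) % p for x in range(p))
    for a in range(1, p)
    for b in range(p)
]

print(is_k_wise_independent(affine, p, 2))  # True
print(is_k_wise_independent(affine, p, 3))  # False
```

The family has only p(p-1) = 20 members, while there are 60 ordered triples of distinct elements, so it cannot be 3-wise independent.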


Completeness and Robustness Properties of Min-Wise Independent Permutations

We provide several new results related to the concept of min-wise independence. Our main result is that any randomized sampling scheme for the relative intersection of sets based on testing equality of samples yields an equivalent min-wise independent family. Thus, in a certain sense, min-wise independent families are "complete" for this type of estimation. We also discuss the notion of robustne...
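The kind of sampling scheme described above can be sketched as follows, assuming that "relative intersection" means |A ∩ B| / |A ∪ B| (an assumption, since the abstract is truncated). For a permutation drawn from a min-wise independent family, the minima of π(A) and π(B) coincide with exactly that probability, so the fraction of matching samples estimates it. The names `resemblance` and `estimate_resemblance` are hypothetical, and fully random permutations are used here in place of any smaller family.

```python
import random


def resemblance(a, b):
    """Exact relative intersection |A ∩ B| / |A ∪ B| of two nonempty sets."""
    return len(a & b) / len(a | b)


def estimate_resemblance(a, b, universe_size, trials, seed=0):
    """Estimate the resemblance by comparing min-samples under random
    permutations of range(universe_size)."""
    rng = random.Random(seed)
    perm = list(range(universe_size))
    matches = 0
    for _ in range(trials):
        rng.shuffle(perm)  # perm[x] is the image of x under a uniform permutation
        if min(perm[x] for x in a) == min(perm[x] for x in b):
            matches += 1
    return matches / trials


a = set(range(0, 60))
b = set(range(30, 90))
exact = resemblance(a, b)  # 30 shared elements out of 90 in the union
approx = estimate_resemblance(a, b, universe_size=100, trials=2000)
print(exact, approx)
```

Only the minimum of each set needs to be stored per permutation, which is what makes equality testing of samples sufficient for the estimate.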


A Derandomization Using Min-Wise Independent Permutations

Min-wise independence is a recently introduced notion of limited independence, similar in spirit to pairwise independence. The latter has proven essential for the derandomization of many algorithms. Here we show that approximate min-wise independence allows similar uses, by presenting a derandomization of the RNC algorithm for approximate set cover due to S. Rajagopalan and V. Vazirani. We also ...



Journal:
  • J. Comput. Syst. Sci.

Volume 60

Publication date: 2000